    Armed Cats: formal concurrency modelling at Arm

    We report on the process for formal concurrency modelling at Arm. An initial formal consistency model of the Arm architecture, written in the cat language, was published and upstreamed to the herd+diy tool suite in 2017. Since then, we have extended the original model with extra features, for example mixed-size accesses, and produced two provably equivalent alternative formulations. In this paper, we present a comprehensive review of the work done at Arm on the consistency model. Along the way, we also show that our principle for handling mixed-size accesses applies to x86, which we confirm via extensive experimental campaigns. We also show that our alternative formulations are applicable to any model phrased in a style similar to the one chosen by Arm.
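
    The abstract does not spell out the mixed-size principle itself. A common approach in this line of work, which we assume here purely for illustration, is to give the reads-from relation meaning at byte granularity, so that a wide read may take different bytes from different writes. The Python sketch below is ours and entirely hypothetical in its names and structure; it is not the Armed Cats model, only a picture of why per-byte decomposition is needed.

        # Illustrative sketch only: decompose each access into byte-sized
        # slices so reads-from is well defined for mixed-size accesses.
        # All names here are hypothetical, not taken from the paper.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Access:
            kind: str   # "R" or "W"
            addr: int   # base address
            size: int   # in bytes
            eid: int    # event id

        def byte_slices(a: Access):
            """Split an access into (event id, byte address) slices."""
            return [(a.eid, a.addr + i) for i in range(a.size)]

        def byte_rf(reads, writes):
            """For each byte of each read, list the writes covering that byte."""
            rf = {}
            for r in reads:
                for eid, byte in byte_slices(r):
                    rf[(eid, byte)] = [w.eid for w in writes
                                       if w.addr <= byte < w.addr + w.size]
            return rf

        # A 2-byte read overlapping two 1-byte writes: each byte may be
        # satisfied by a different write, which a single rf edge per read
        # event cannot express.
        w0 = Access("W", 0x100, 1, eid=0)
        w1 = Access("W", 0x101, 1, eid=1)
        r  = Access("R", 0x100, 2, eid=2)
        print(byte_rf([r], [w0, w1]))   # {(2, 256): [0], (2, 257): [1]}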

    FP8 Formats for Deep Learning

    FP8 is a natural progression for accelerating deep learning training and inference beyond the 16-bit formats common in modern processors. In this paper we propose an 8-bit floating point (FP8) binary interchange format consisting of two encodings: E4M3 (4-bit exponent and 3-bit mantissa) and E5M2 (5-bit exponent and 2-bit mantissa). While E5M2 follows IEEE 754 conventions for the representation of special values, E4M3's dynamic range is extended by not representing infinities and having only one mantissa bit-pattern for NaNs. We demonstrate the efficacy of the FP8 format on a variety of image and language tasks, effectively matching the result quality achieved by 16-bit training sessions. Our study covers the main modern neural network architectures (CNNs, RNNs, and Transformer-based models), leaving all the hyperparameters unchanged from the 16-bit baseline training sessions. Our training experiments include large language models of up to 175B parameters. We also examine FP8 post-training quantization of language models trained using 16-bit formats that resisted fixed-point int8 quantization.
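
    A minimal decoding sketch, written by us rather than taken from the paper, makes the difference between the two encodings concrete: E5M2 keeps IEEE 754 infinities and NaNs, while E4M3 spends the all-ones exponent on ordinary values, extending its largest finite magnitude to 448 at the cost of reserving a single mantissa pattern for NaN.

        # Hedged sketch: decode an 8-bit FP8 value under the two encodings.
        # E5M2 follows IEEE 754 special-value conventions; E4M3 reclaims the
        # all-ones exponent for normal numbers, keeps one NaN pattern, and
        # has no infinities.

        def decode_fp8(byte: int, exp_bits: int, man_bits: int,
                       ieee_specials: bool) -> float:
            sign = -1.0 if (byte >> 7) & 1 else 1.0
            exp  = (byte >> man_bits) & ((1 << exp_bits) - 1)
            man  = byte & ((1 << man_bits) - 1)
            bias = (1 << (exp_bits - 1)) - 1
            if exp == (1 << exp_bits) - 1:          # all-ones exponent
                if ieee_specials:                   # E5M2: IEEE inf/NaN
                    return sign * float("inf") if man == 0 else float("nan")
                if man == (1 << man_bits) - 1:      # E4M3: only S.1111.111 is NaN
                    return float("nan")
                # otherwise E4M3 treats it as a normal number (extended range)
            if exp == 0:                            # subnormal
                return sign * man * 2.0 ** (1 - bias - man_bits)
            return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

        e5m2 = lambda b: decode_fp8(b, 5, 2, ieee_specials=True)
        e4m3 = lambda b: decode_fp8(b, 4, 3, ieee_specials=False)

        print(e5m2(0b0_11110_11))   # largest E5M2 normal: 57344.0
        print(e4m3(0b0_1111_110))   # E4M3 uses the top exponent: 448.0
        print(e4m3(0b0_1111_111))   # the single E4M3 NaN pattern: nan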

    Herding Cats: Modelling, Simulation, Testing, and Data Mining for Weak Memory

    We propose a generic axiomatic framework for modelling weak memory. We show how to instantiate this framework for SC, TSO, C++ restricted to release-acquire atomics, and Power. For Power, we compare our model to a preceding operational model in which we found a flaw. To do so, we define an operational model that we show equivalent to our axiomatic model. We also propose a model for ARM. Our testing on this architecture revealed a behaviour later acknowledged as a bug by ARM, and more recently 31 additional anomalies. We offer a new simulation tool, called herd, which allows the user to specify the model of their choice in a concise way. Given a specification of a model, the tool becomes a simulator for that model. The tool relies on an axiomatic description; this choice allows us to outperform all previous simulation tools. Additionally, we confirm that verification time is vastly improved in the case of bounded model checking. Finally, we put our models in perspective, in the light of empirical data obtained by analysing the C and C++ code of a Debian Linux distribution. We present our new analysis tool, called mole, which explores a piece of code to find the weak memory idioms that it uses.
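
    To make the axiomatic style concrete: in this framework, sequential consistency amounts to requiring that the union of program order (po) with the communication relations (rf, co, fr) be acyclic. The self-contained Python sketch below is our illustration, not herd's actual code; it checks the classic store-buffering test and finds the cycle that makes the relaxed outcome forbidden under SC.

        # Sketch of the axiomatic style used by herd: an execution is a set
        # of events plus relations (po, rf, co, fr), and a model such as SC
        # is an acyclicity constraint over their union. Event names are ours.

        from itertools import chain

        def acyclic(edges, nodes):
            """DFS cycle check over a relation given as (src, dst) pairs."""
            succ = {n: [d for s, d in edges if s == n] for n in nodes}
            WHITE, GREY, BLACK = 0, 1, 2
            color = dict.fromkeys(nodes, WHITE)
            def dfs(n):
                color[n] = GREY
                for m in succ[n]:
                    if color[m] == GREY or (color[m] == WHITE and not dfs(m)):
                        return False
                color[n] = BLACK
                return True
            return all(color[n] != WHITE or dfs(n) for n in nodes)

        # Store buffering (SB): each thread writes one variable then reads
        # the other, and both reads see the initial value.
        events = ["Wx1", "Ry0", "Wy1", "Rx0", "Ix", "Iy"]  # I* = initial writes
        po = {("Wx1", "Ry0"), ("Wy1", "Rx0")}
        rf = {("Iy", "Ry0"), ("Ix", "Rx0")}    # both reads see the inits
        co = {("Ix", "Wx1"), ("Iy", "Wy1")}    # inits first in coherence
        # fr: a read precedes every write coherence-ordered after its source
        fr = {("Ry0", "Wy1"), ("Rx0", "Wx1")}

        sc_ok = acyclic(set(chain(po, rf, co, fr)), events)
        print("SC allows SB's r0=0,r1=0 outcome:", sc_ok)   # False: cycle found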